# Speeding Up EEGUnity Processing with Multithreading

## 1. Introduction

This tutorial shows you how to accelerate EEG data processing in EEGUnity using Python’s built-in thread pool support from the `concurrent.futures` module. By exporting several groups of EEG data in parallel, you can reduce the total time it takes to export large datasets, provided the work is dominated by I/O.

If you're new to multithreading in Python, check out the [official Python documentation on `concurrent.futures`](https://docs.python.org/3/library/concurrent.futures.html) or this helpful [Real Python guide on multithreading and multiprocessing](https://realpython.com/python-concurrency/).

---

## 2. When to Use Multithreading

Multithreading is useful when you need to perform many similar tasks that are not CPU-bound, for example exporting EEG datasets, where most of the time is spent on I/O operations. In standard CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads help mainly when tasks spend their time waiting on disk or network.

EEGUnity allows grouping datasets by domain, which makes it easy to split the work into independent tasks for parallel processing.

---

## 3. Step-by-Step Guide

### Step 1: Group Dataset by Domain

Use EEGUnity’s built-in method to group the dataset before parallel processing:

```python
from eegunity import UnifiedDataset

# locator_path is assumed to be defined earlier in your script
u_dataset = UnifiedDataset(locator_path=locator_path)
u_dataset_list = u_dataset.group_by_domain()  # Returns a list of UnifiedDataset instances
```

---

### Step 2: Define the Task Function

This function handles the export process for each grouped dataset:

```python
def export_task(u_dataset_grouped):
    # output_path is assumed to be defined at module level before the tasks run
    domain_tag = u_dataset_grouped.get_locator().iloc[0]['Domain Tag']
    print(f"[START] Exporting: {domain_tag}")
    u_dataset_grouped.eeg_batch.export_h5Dataset(output_path, name=domain_tag)
    print(f"[DONE ] Exported: {domain_tag}")
```

---

### Step 3: Run Tasks with `ThreadPoolExecutor`

Use a thread pool to run the export tasks in parallel:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Limit the number of threads to avoid system overload.
# max(1, ...) guards against an empty list, since max_workers must be at least 1.
max_threads = max(1, min(8, len(u_dataset_list)))

with ThreadPoolExecutor(max_workers=max_threads) as executor:
    futures = [executor.submit(export_task, group) for group in u_dataset_list]
    for future in as_completed(futures):
        try:
            future.result()  # Wait for task to complete and raise exceptions if any
        except Exception as e:
            print(f"Task failed: {e}")
```

---

## 4. Summary

Using `ThreadPoolExecutor` with grouped EEGUnity datasets is a convenient way to speed up exporting. Make sure your task is I/O-bound or lightweight enough to benefit from multithreading. If your tasks are CPU-intensive, consider using `ProcessPoolExecutor` instead.

---

## 5. Learn More

- [Python `concurrent.futures` Documentation](https://docs.python.org/3/library/concurrent.futures.html)
- [Multithreading vs Multiprocessing in Python (Real Python)](https://realpython.com/python-concurrency/)
- [EEGUnity GitHub Repository](https://github.com/935963004/LaBraM) - Source for EEGUnity
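
The thread pool pattern from Step 3 can be tried without EEGUnity installed. The sketch below is only an illustration under stated assumptions: `fake_export` and `run_all` are hypothetical names that are not part of EEGUnity, and `time.sleep` stands in for the real I/O-bound export call. It also maps each future back to its tag, so a failed task can be reported by name.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_export(tag, delay=0.2):
    """Hypothetical stand-in for an I/O-bound export (not an EEGUnity call)."""
    if tag == "bad":
        raise ValueError(f"cannot export {tag}")
    time.sleep(delay)  # Simulates waiting on disk I/O
    return tag


def run_all(tags, max_threads=8):
    """Run fake_export for every tag in a thread pool; return (done, failed)."""
    done, failed = [], {}
    # max_workers must be at least 1, even when the task list is empty
    workers = max(1, min(max_threads, len(tags)))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Map each future back to its tag so failures can be reported by name
        future_to_tag = {executor.submit(fake_export, t): t for t in tags}
        for future in as_completed(future_to_tag):
            tag = future_to_tag[future]
            try:
                done.append(future.result())  # Re-raises any task exception
            except Exception as exc:
                failed[tag] = str(exc)
    return done, failed


if __name__ == "__main__":
    start = time.perf_counter()
    done, failed = run_all(["domain_a", "domain_b", "domain_c", "bad"])
    elapsed = time.perf_counter() - start
    print(f"done={sorted(done)} failed={failed} elapsed={elapsed:.2f}s")
```

Because the simulated tasks only wait, the total run time should be close to a single task's delay rather than the sum of all delays. A CPU-bound task would not show the same gain with threads.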